SYDE 556/750: Simulating Neurobiological Systems

Terry Stewart

Online Learning

  • What do we mean by learning?
    • When we use an integrator to keep track of location, is that learning?
    • Probably not
    • What about the learning used to complete a pattern in the Raven's Progressive Matrices task?
    • Less clear
  • We'll stick with a simple definition of learning
    • Changing connection weights between groups of neurons
  • Why might we want to change connection weights?
  • This is what traditional neural network approaches do
    • Change connection weights until it performs the desired task
    • Once it's doing the task, stop changing the weights
  • But we have a method for just solving for the optimal connection weights
    • So why bother learning?

Why learning might be useful

  • We might not know the function at the beginning of the task
    • Example: a creature explores its environment and learns that eating red objects is bad, but eating green objects is good
      • what are the inputs and outputs here?
  • The desired function might change
    • Example: an ensemble whose input is a desired hand position, but the output is the muscle tension (or joint angles) needed to get there
      • why would this change?
  • The optimal weights we solve for might not be optimal
    • How could they not be optimal?
    • What assumptions are we making?

The simplest approach

  • What's the easiest way to deal with this, given what we know?
  • If we need new decoders
    • Let's solve for them while the model's running
    • Gather data to build up our $\Gamma$ and $\Upsilon$ matrices
  • Example: eating red but not green objects
    • Decoder from state to $Q$ value (utility of action) for eating
    • State is some high-dimensional vector that includes the colour of what we're looking for
      • And probably some other things, like whether it's small enough to be eaten
    • Initially doesn't use colour to get output
    • But we might experience a few bad outcomes after red, and good after green
    • These become new $x$ samples, with corresponding $f(x)$ outputs
    • Gather a few, recompute the decoder (see the sketch after this list)
      • Could even do this after every timestep
  • Example: converting hand position to muscle commands
    • Send random signals to muscles
    • Observe hand position
    • Use that to train decoders
  • Example: going from optimal to even more optimal
    • As the model runs, we gather $x$ values
    • Recompute decoder for those $x$ values
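
A minimal sketch of this idea (all names are illustrative), assuming we can record the neuron activities for each sample: accumulate the least-squares statistics $\Gamma$ and $\Upsilon$ as samples arrive, and re-solve for the decoders whenever we like.

```python
import numpy as np

# Online decoder solving (names illustrative): keep running sums of the
# least-squares statistics and re-solve as new (x, f(x)) samples arrive.
n_neurons, dims = 100, 1
Gamma = np.zeros((n_neurons, n_neurons))   # running sum of a a^T
Upsilon = np.zeros((n_neurons, dims))      # running sum of a f(x)^T

def observe(a, fx):
    """Add one sample: a is the activity vector, fx the observed f(x)."""
    global Gamma, Upsilon
    Gamma += np.outer(a, a)
    Upsilon += np.outer(a, fx)

def solve_decoders(reg=0.1):
    # a small ridge term stands in for the usual noise regularization;
    # this could be called after every timestep, at some cost
    return np.linalg.solve(Gamma + reg * np.eye(n_neurons), Upsilon)
```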

What's wrong with this approach

  • Feels like cheating
  • Why?
  • Two kinds of problems:
    • Not biologically realistic
      • How are neurons supposed to do all this?
      • store data
      • solve decoders
      • timing
    • Computationally expensive
      • Even if we're not worried about realism
  • Note: these two problems may be related...

Traditional neural networks

  • What do they do?
  • Incremental learning
    • as you get examples, shift the connection weights slightly based on that example
    • don't have to consider all the data when making an update
  • Example: Perceptron learning (Rosenblatt, 1957); see the sketch after this list
    • $\Delta w_i = \kappa (y_d - y) x_i$

  • Problems with perceptron
    • can't do all possible functions
    • Just linear functions of $x$
    • Is that a problem?
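
A minimal sketch of the perceptron rule (all names illustrative), assuming a single threshold unit:

```python
import numpy as np

kappa = 0.01  # learning rate

def perceptron_step(w, x, y_d):
    y = 1.0 if np.dot(w, x) > 0 else 0.0  # threshold activation
    return w + kappa * (y_d - y) * x      # Delta w_i = kappa (y_d - y) x_i

# usage: learn OR over 2-bit inputs (third input is a constant bias)
w = np.zeros(3)
data = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 1], 1), ([1, 1, 1], 1)]
for _ in range(100):
    for x, y_d in data:
        w = perceptron_step(w, np.array(x, dtype=float), y_d)
```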

Backprop and the NEF

  • How is this problem normally solved?
    • Multiple layers

  • But now a new rule is needed
    • Standard answer: backprop
    • Same as perceptron for first layer
    • Estimate correct "hidden layer" input, and repeat
  • What would this be in NEF terms?
  • Remember that we're already fine with linear decoding
    • encoders (and $\alpha$ and $J^{bias}$) are the first layer of weights, decoders are the second (see the sketch after this list)
    • Note that in the NEF, we combine many of these together
  • We can just use the standard perceptron rule
    • as long as there are lots of neurons, and we've initialized them well with the desired intercepts and maximum rates, we should be able to decode
    • but, what might backprop do?
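
A quick numerical check of that factorization (all names illustrative): connecting through the full weight matrix $\omega_{ij} = \alpha_j d_i \cdot e_j$ delivers the same input current as decoding with $d$ and then re-encoding with $\alpha$ and $e$ (bias currents omitted).

```python
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post, D = 100, 80, 2
d = rng.normal(size=(n_pre, D)) * 0.01       # decoders of the pre ensemble
e = rng.normal(size=(n_post, D))             # encoders of the post ensemble
e /= np.linalg.norm(e, axis=1, keepdims=True)
alpha = rng.uniform(100, 200, n_post)        # gains

W = (alpha[:, None] * e) @ d.T               # full weights, shape (n_post, n_pre)

a = rng.uniform(0, 100, n_pre)               # presynaptic activities
J_factored = alpha * (e @ (d.T @ a))         # decode, then encode
J_full = W @ a                               # one full weight matrix
assert np.allclose(J_factored, J_full)
```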

Biologically realistic perceptron learning

  • Simple learning rule: $\Delta d_i = \kappa (y_d - y)a_i$
  • How do we make it realistic?
  • Decoders don't exist in the brain
    • Need weights
  • $\omega_{ij} = \alpha_j d_i \cdot e_j$
  • $\Delta \omega_{ij} = \alpha_j \kappa (y_d - y)a_i \cdot e_j$
  • let's write $(y_d - y)$ as $E$
  • $\Delta \omega_{ij} = \alpha_j \kappa a_i E \cdot e_j$
  • $\Delta \omega_{ij} = \kappa a_i (\alpha_j E \cdot e_j)$
    • What's $\alpha_j E \cdot e_j$?
    • That's the current that this neuron would get if it had $E$ as an input
    • but we don't want this current to drive the neuron
    • rather, we want it to change the weight
    • a modulatory input
  • This is the "Prescribed Error Sensitivity" PES rule (MacNeil & Eliasmith, 2011)

    • Any model in the NEF could use this instead of computing decoders
    • Requires some other neural group computing the error $E$
    • Used in Spaun for Q-value learning (reinforcement task)
    • Can even be used to learn circular convolution
      • only demonstrated up to 3 dimensions
      • why not more?
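
A minimal rate-based sketch of that final form (names illustrative); the error $E$ is assumed to come from some other neural group:

```python
import numpy as np

# PES sketch: Delta w_ij = kappa * a_i * (alpha_j E . e_j)
def pes_step(w, a, E, alpha, encoders, kappa=1e-4):
    # the modulatory current each postsynaptic neuron would get from E
    mod = alpha * (encoders @ E)           # shape (n_post,)
    # scale by presynaptic activity to get the weight update
    return w + kappa * np.outer(a, mod)    # w has shape (n_pre, n_post)
```
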
  • Nengo Examples:

    • learn_communicate.py
    • learn_square.py
    • learn_product.py
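
For reference, a sketch of the communication-channel example in the modern Nengo API (the filenames above come from the older scripting interface); the ensemble sizes and learning rate are illustrative:

```python
import numpy as np
import nengo

model = nengo.Network()
with model:
    stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)
    error = nengo.Ensemble(100, dimensions=1)  # neural group computing E

    nengo.Connection(stim, pre)
    # start from a connection that computes nothing useful
    conn = nengo.Connection(pre, post, function=lambda x: [0.0],
                            learning_rule_type=nengo.PES(learning_rate=1e-4))
    # error = actual - desired, fed into the learning rule
    nengo.Connection(post, error)
    nengo.Connection(stim, error, transform=-1)
    nengo.Connection(error, conn.learning_rule)

with nengo.Simulator(model) as sim:
    sim.run(10.0)
```
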
  • Is this realistic?

    • uses only locally available information
    • Does anything like this happen in the brain?
    • Dopamine seems to act as a gain on weight changes (maybe)
    • But are weight changes proportional to pre-synaptic activity?
      • Sort of

More complex learning

  • Hebbian learning

    • completely unsupervised
    • Neurons that fire together, wire together
    • $\Delta \omega_{ij} = \kappa a_i a_j$
    • on its own, that rule would be unstable
      • Why?
  • BCM rule (Bienenstock, Cooper, & Munro, 1982)

    • $\Delta \omega_{ij} = \kappa a_i a_j (a_j-\theta)$
    • $\theta$ is an activity threshold
      • if post-synaptic neuron is more active than this threshold, increase strength
      • otherwise decrease it
    • Other than that, it's a standard Hebbian rule
    • Where would we get $\theta$?
      • need to store something about the overall recent activity of neuron $j$ so it can be compared to its current activity
      • Just have $\theta$ be a PSTC-filtered (low-pass) version of $a_j$'s spiking (see the sketch after this list)
    • Result: only a few neurons will fire
      • sparsification
    • What would this do in NEF terms?
      • Still represent $x$, but with very sparse encoders
    • This is still a rule on the weight matrix, but functionally seems to be more about encoders than decoders
      • What could we do, given that?
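
A minimal rate-based sketch of BCM with a running-average threshold (all names illustrative):

```python
import numpy as np

kappa, tau, dt = 1e-6, 0.1, 0.001

def bcm_step(w, a_pre, a_post, theta):
    # theta tracks recent postsynaptic activity: a low-pass filter,
    # standing in for the PSTC-filtered spiking described above
    theta = theta + (dt / tau) * (a_post - theta)
    # Hebbian term gated by how far a_j is above or below its threshold
    dw = kappa * np.outer(a_pre, a_post * (a_post - theta))
    return w + dw, theta
```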

The homeostatic Prescribed Error Sensitivity (hPES) rule

  • Works as well as (or better than) PES
    • Seems to be a bit more stable, but analysis is ongoing
  • Biological evidence?
    • Spike-Timing Dependent Plasticity (STDP)

  • Still work to do for comparison, but seems promising
  • An error-driven term for improving decoders
  • A Hebbian sparsification term to improve encoders (a combined sketch follows below)
    • or perhaps to sparsify connections (energy savings in the brain, but not necessarily in simulation)
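
A heavily hedged sketch of how the two terms might be combined, in the spirit of hPES; the mixing parameter $S$ and all names here are assumptions for illustration, not the published form of the rule:

```python
import numpy as np

# Hypothetical hPES-style update (assumed form): a weighted blend of the
# PES error-driven term and a BCM-like homeostatic term.
def hpes_step(w, a_pre, a_post, E, alpha, encoders, theta, kappa=1e-4, S=0.8):
    error_term = encoders @ E                # PES: error along each encoder
    hebb_term = a_post * (a_post - theta)    # BCM: homeostatic sparsification
    dw = kappa * np.outer(a_pre, alpha * (S * error_term + (1 - S) * hebb_term))
    return w + dw
```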